AITopics | difference model


Offline RLHF Methods Need More Accurate Supervision Signals

Wang, Shiqi, Zhang, Zhengze, Zhao, Rui, Tan, Fei, Nguyen, Cam Tu

arXiv.org Artificial Intelligence, Aug-18-2024

With the rapid advances in Large Language Models (LLMs), aligning LLMs with human preferences has become increasingly important. Although Reinforcement Learning with Human Feedback (RLHF) proves effective, it is complicated and highly resource-intensive. As such, offline RLHF has been introduced as an alternative solution, which directly optimizes LLMs with ranking losses on a fixed preference dataset. Current offline RLHF only captures the "ordinal relationship" between responses, overlooking the crucial aspect of "how much" one is preferred over the others. To address this issue, we propose a simple yet effective solution called Reward Difference Optimization, abbreviated as RDO. Specifically, we introduce reward difference coefficients to reweight sample pairs in offline RLHF. We then develop a difference model involving rich interactions between a pair of responses for predicting these difference coefficients. Experiments with 7B LLMs on the HH and TL;DR datasets substantiate the effectiveness of our method in both automatic metrics and human evaluation, thereby highlighting its potential for aligning LLMs with human intent and values.
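The reweighting idea in this abstract can be illustrated with a minimal sketch. It assumes a logistic (Bradley-Terry style) pairwise ranking loss as the base objective, which the abstract does not specify. The function name, scores, and coefficients below are hypothetical, and in the paper the coefficients are predicted by a learned difference model rather than supplied by hand.

```python
import math


def weighted_ranking_loss(chosen_scores, rejected_scores, diff_coeffs):
    """Mean of coeff * -log(sigmoid(chosen - rejected)) over response pairs.

    Assumption: a logistic pairwise loss stands in for the ranking loss;
    diff_coeffs plays the role of the reward difference coefficients.
    """
    if not (len(chosen_scores) == len(rejected_scores) == len(diff_coeffs)):
        raise ValueError("all inputs must have the same length")
    if not chosen_scores:
        raise ValueError("at least one pair is required")

    total = 0.0
    for s_w, s_l, coeff in zip(chosen_scores, rejected_scores, diff_coeffs):
        margin = s_w - s_l
        # -log(sigmoid(m)) = log(1 + exp(-m)), evaluated in a stable form.
        pair_loss = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
        # A larger coefficient makes this pair count more in the objective.
        total += coeff * pair_loss
    return total / len(chosen_scores)


# Hypothetical scores for two preference pairs and their coefficients.
chosen = [1.2, 0.3]
rejected = [0.4, 0.1]
coeffs = [1.5, 0.5]
print(weighted_ranking_loss(chosen, rejected, coeffs))
```

Setting every coefficient to 1 recovers the plain ordinal loss, which is one way to see that the coefficients only add "how much" information on top of the ranking.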

coefficient, difference model, reward model, (15 more...)

arXiv: 2408.09385

Country:

Asia > China > Jiangsu Province > Nanjing (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Algorithmically Designed Artificial Neural Networks (ADANNs): Higher order deep operator learning for parametric partial differential equations

Jentzen, Arnulf, Riekert, Adrian, von Wurstemberger, Philippe

arXiv.org Machine Learning, Feb-7-2023

Deep learning approximation methods, usually consisting of deep artificial neural networks (ANNs) trained through stochastic gradient descent (SGD) optimization methods, are nowadays among the most heavily employed approximation methods in the digital world. The striking feature of deep learning methods is that in many situations numerical simulations suggest that the computational effort of such methods seems to grow at most polynomially in the input dimension d ∈ ℕ = {1, 2, 3, ...} of the problem under consideration. In contrast, classical numerical methods usually suffer from the so-called curse of dimensionality (cf., e.g., Bellman [4], Novak & Wozniakowski [37, Chapter 1], and Novak & Wozniakowski [38, Chapter 9]) in the sense that the computational effort grows at least exponentially in the dimension. In recent years, deep learning technologies have also been intensively used to attack problems from scientific computing such as the numerical solution of partial differential equations (PDEs). In particular, deep learning approximation methods have been used to approximately solve high-dimensional nonlinear PDEs (see, e.g., [2,5,10,11,14,16,25,42] and the references mentioned therein) such as high-dimensional nonlinear pricing problems from financial engineering and Hamilton-Jacobi-Bellman equations from optimal control. In the context of such high-dimensional nonlinear PDEs, the progress of deep learning approximation methods is obvious as there are, apart from some special cases (see, e.g., [19, 20, 36] and the references therein for branching-type methods and see, e.g., [11-13, 22] and the references therein for multilevel Picard methods), essentially no alternative numerical approximation methods which are capable of solving such high-dimensional nonlinear PDEs. There is nowadays also a huge literature on deep learning approximation methods for low-dimensional PDEs (cf., e.g., [24, 41]).

artificial intelligence, base model, machine learning, (17 more...)

arXiv: 2302.03286

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Torus Graphs for Multivariate Phase Coupling Analysis

Klein, Natalie, Orellana, Josue, Brincat, Scott, Miller, Earl K., Kass, Robert E.

arXiv.org Machine Learning, Oct-24-2019

Angular measurements are often modeled as circular random variables, where there are natural circular analogues of moments, including correlation. Because a product of circles is a torus, a d-dimensional vector of circular random variables lies on a d-dimensional torus. For such vectors we present here a class of graphical models, which we call torus graphs, based on the full exponential family with pairwise interactions. The topological distinction between a torus and Euclidean space has several important consequences. Our development was motivated by the problem of identifying phase coupling among oscillatory signals recorded from multiple electrodes in the brain: oscillatory phases across electrodes might tend to advance or recede together, indicating coordination across brain areas. The data analyzed here consisted of 24 phase angles measured repeatedly across 840 experimental trials (replications) during a memory task, where the electrodes were in 4 distinct brain regions, all known to be active while memories are being stored or retrieved. In realistic numerical simulations, we found that a standard pairwise assessment, known as phase locking value, is unable to describe multivariate phase interactions, but that torus graphs can accurately identify conditional associations. Torus graphs generalize several more restrictive approaches that have appeared in various scientific literatures, and produced intuitive results in the data we analyzed. Torus graphs thus unify multivariate analysis of circular data and present fertile territory for future research.
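For context, the phase locking value mentioned in this abstract is conventionally computed for two channels as the length of the trial-averaged unit vector of their phase difference. The following is a minimal sketch of that pairwise statistic only, not of the torus graph model; the function name and the sample phases are hypothetical.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """Return |mean over trials of exp(i * (a - b))| for two phase series.

    Phases are in radians, one value per trial. The result lies in [0, 1]:
    values near 1 mean a consistent phase difference across trials.
    """
    if len(phases_a) != len(phases_b):
        raise ValueError("both channels need the same number of trials")
    if not phases_a:
        raise ValueError("at least one trial is required")

    # Map each phase difference to a point on the unit circle and average.
    mean_vector = sum(
        cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b)
    ) / len(phases_a)
    return abs(mean_vector)


# Hypothetical example: channel B lags channel A by a fixed quarter cycle.
channel_a = [0.1, 1.3, 2.9, 4.0]
channel_b = [a - math.pi / 2 for a in channel_a]
print(phase_locking_value(channel_a, channel_b))
```

Because this statistic looks at one pair of channels at a time, it cannot separate direct coupling from coupling mediated by a third channel, which is the limitation the abstract attributes to pairwise assessment.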

graph, phase difference, torus graph, (12 more...)

arXiv: 1910.11044

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Socratic Learning: Augmenting Generative Models to Incorporate Latent Subsets in Training Data

Varma, Paroma, He, Bryan, Iter, Dan, Xu, Peng, Yu, Rose, De Sa, Christopher, Ré, Christopher

arXiv.org Machine Learning, Sep-28-2017

A challenge in training discriminative models like neural networks is obtaining enough labeled training data. Recent approaches use generative models to combine weak supervision sources, like user-defined heuristics or knowledge bases, to label training data. Prior work has explored learning accuracies for these sources even without ground truth labels, but it assumes that a single accuracy parameter is sufficient to model the behavior of these sources over the entire training set. In particular, it fails to model latent subsets in the training data in which the supervision sources perform differently than on average. We present Socratic learning, a paradigm that uses feedback from a corresponding discriminative model to automatically identify these subsets and augment the structure of the generative model accordingly. Experimentally, we show that without any ground truth labels, the augmented generative model reduces error by up to 56.06% for a relation extraction task compared to a state-of-the-art weak supervision technique that utilizes generative models.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 1610.08123

Country:

North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
